Modelling Discourse Relations for Arabic
Abstract
We present the first algorithms to automatically identify explicit discourse connectives and the relations they signal for Arabic text. First, we show that, in Arabic news, most adjacent sentences are connected via explicit connectives, in contrast to English, which makes the treatment of explicit discourse connectives highly important for Arabic. We also show that explicit Arabic discourse connectives are far more ambiguous than English ones, which makes their treatment challenging. In the second part of the paper, we present supervised algorithms for automatic discourse connective identification and discourse relation recognition. Our connective identifier based on gold-standard syntactic features achieves almost human performance. In addition, an identifier based solely on simple lexical features and automatically derived morphological and POS features performs with high reliability, which is essential for languages that do not yet have high-quality parsers. Our algorithm for recognizing discourse relations performs significantly better than a baseline based on the connective surface string alone and therefore reduces the ambiguity in explicit connective interpretation.
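As an illustration of what a baseline "based on the connective surface string alone" could look like, the following is a minimal sketch, not the authors' implementation. It assumes the baseline assigns each connective the relation it most often signals in the training data; the function names, the fallback for unseen connectives, and the toy data (`conn_a`, `conn_b`, and the relation labels) are all hypothetical.

```python
from collections import Counter, defaultdict


def train_surface_baseline(examples):
    """Learn the most frequent relation for each connective surface string.

    examples: iterable of (connective_string, relation_label) pairs.
    Returns (table, fallback), where table maps a connective string to its
    majority relation and fallback is the overall majority relation.
    """
    counts = defaultdict(Counter)
    overall = Counter()
    for connective, relation in examples:
        counts[connective][relation] += 1
        overall[relation] += 1
    # Majority relation per connective string.
    table = {c: rels.most_common(1)[0][0] for c, rels in counts.items()}
    # Assumption: unseen connectives get the globally most frequent relation.
    fallback = overall.most_common(1)[0][0] if overall else None
    return table, fallback


def predict(table, fallback, connective):
    """Predict a relation from the connective string alone."""
    return table.get(connective, fallback)


# Hypothetical toy data, not taken from any corpus.
toy = [
    ("conn_a", "CAUSE"),
    ("conn_a", "CAUSE"),
    ("conn_a", "CONTRAST"),
    ("conn_b", "TEMPORAL"),
]
table, fallback = train_surface_baseline(toy)
```

A baseline of this kind ignores context entirely, so an ambiguous connective such as the hypothetical `conn_a` always receives its majority label; a context-aware classifier can only beat it by resolving those ambiguous cases.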
Similar resources
The Leeds Arabic Discourse Treebank: Annotating Discourse Connectives for Arabic
We present the first effort towards producing an Arabic Discourse Treebank, a news corpus where all discourse connectives are identified and annotated with the discourse relations they convey as well as with the two arguments they relate. We discuss our collection of Arabic discourse connectives as well as principles for identifying and annotating them in context, taking into account properties...
Modelling the Substitutability of Discourse Connectives
Processing discourse connectives is important for tasks such as discourse parsing and generation. For these tasks, it is useful to know which connectives can signal the same coherence relations. This paper presents experiments into modelling the substitutability of discourse connectives. It shows that substitutability affects distributional similarity. A novel variance-based function for compari...
Translating English Discourse Connectives into Arabic: a Corpus-based Analysis and an Evaluation Metric
Discourse connectives can often signal multiple discourse relations, depending on their context. The automatic identification of the Arabic translations of seven English discourse connectives shows how these connectives are differently translated depending on their actual senses. Automatic labelling of English source connectives can help a machine translation system to translate them more corre...
Underspecified Modelling of Complex Discourse Constraints
We introduce a new type of discourse constraint for the interaction of discourse relations with the configuration of discourse segments. We examine corpus-extracted examples as soft constraints. We show how to use Regular Tree Grammars to process such constraints, and how the representation of some constraints depends on the expressive power of this formalism.
Computational Approaches to Arabic Script-based Languages